35 research outputs found

    Astrophysical code migration into Exascale Era

    The ExaNeSt and EuroExa H2020 EU-funded projects aim to design and develop an exascale-ready computing platform prototype based on low-energy-consumption ARM64 cores and FPGA accelerators. We participate in the application-driven design of the hardware solutions and in prototype validation. To carry out this work we are using, among others, Hy-Nbody, a state-of-the-art direct N-body code. The core algorithms of Hy-Nbody have been progressively adapted to fit the exascale target platform. While waiting for the ExaNeSt prototype release, we are performing tests and code tuning on an ARM64 SoC facility: a SLURM-managed HPC cluster based on the 64-bit ARMv8 Cortex-A72/Cortex-A53 core design and equipped with a Mali-T864 embedded GPU. In parallel, we are porting a kernel of Hy-Nbody to FPGA in order to test and compare the performance-per-watt of our algorithms on different platforms. In this paper we describe how we re-engineered the application and show first results on the ARM SoC. Comment: 4 pages, 1 figure, 1 table; proceedings of ADASS XXVIII, accepted by ASP Conference Series
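For readers unfamiliar with the term, a "direct" N-body code evaluates the gravitational force on each particle by summing over every other particle, which costs O(N^2) operations per step. The sketch below illustrates only that pairwise summation in plain Python. It is not taken from Hy-Nbody: the function name `direct_accelerations`, the `softening` parameter and the choice of units (G = 1) are assumptions made for illustration.

```python
import math


def direct_accelerations(positions, masses, softening=1e-3, G=1.0):
    """Return the acceleration on each particle from all other particles.

    positions: list of (x, y, z) tuples; masses: list of floats.
    softening and G are illustrative assumptions, not values from the abstract.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    eps2 = softening * softening
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            # Softened squared distance avoids a singularity at close encounters
            r2 = dx * dx + dy * dy + dz * dz + eps2
            factor = G * masses[j] / (r2 * math.sqrt(r2))
            acc[i][0] += factor * dx
            acc[i][1] += factor * dy
            acc[i][2] += factor * dz
    return acc
```

The doubly nested loop is the part that accelerator ports typically target, since each (i, j) interaction is independent of the others.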

    Simulating realistic disk galaxies with a novel sub-resolution ISM model

    We present results of cosmological simulations of disk galaxies carried out with the GADGET-3 TreePM+SPH code, where star formation and stellar feedback are described using our MUlti Phase Particle Integrator (MUPPI) model. This description is based on a simple multi-phase model of the interstellar medium at unresolved scales, where mass and energy flows among the components are explicitly followed by solving a system of ordinary differential equations. Thermal energy from SNe is injected into the local hot phase, so that it is not promptly radiated away. A kinetic feedback prescription generates the massive outflows needed to avoid the over-production of stars. We use two sets of zoomed-in initial conditions of isolated cosmological halos with masses (2-3) * 10^{12} Msun, both available at several resolution levels. In all cases we obtain spiral galaxies with small bulge-over-total stellar mass ratios (B/T \approx 0.2), extended stellar and gas disks, flat rotation curves and realistic values of stellar masses. Gas profiles are relatively flat, and molecular gas is found to dominate at the centre of galaxies, with star formation rates following the observed Schmidt-Kennicutt relation. Stars kinematically belonging to the bulge form early, while disk stars show a clear inside-out formation pattern and mostly form after redshift z=2. However, the baryon conversion efficiencies in our simulations differ from the relation given by Moster et al. (2010) at a 3 sigma level, thus indicating that our stellar disks are still too massive for the Dark Matter halo in which they reside. Results are found to be remarkably stable against resolution. This further demonstrates the feasibility of carrying out simulations producing a realistic population of galaxies within representative cosmological volumes, at a relatively modest resolution. Comment: 19 pages, 21 figures, MNRAS accepted

    Software acceleration on Xilinx FPGAs using OmpSs@FPGA ecosystem

    The OmpSs@FPGA programming model allows offloading application functionality to Xilinx Field Programmable Gate Arrays (FPGAs). The OmpSs compiler splits the code (written in the C/C++ high-level language) into two parts, targeting the host and the FPGA. The former is usually compiled by the GNU Compiler Collection (GCC), while the latter is given to the Xilinx Vivado HLS tool (hereafter HLS) for high-level synthesis to VHDL and to the bitstream used to program the FPGA. OmpSs@FPGA is based on compiler directives, which allow the programmer to annotate parts of the code so that they automatically exploit all Symmetric MultiProcessor system (smp) and FPGA resources available in the execution platform. This technical report provides both descriptive and hands-on introductions to building application-specific FPGA systems using the high-level OmpSs@FPGA tool. The goal is to give the reader a baseline view of the process of creating an optimized hardware design by annotating C-based code with HLS directives. We assume the reader has a working knowledge of C/C++ and familiarity with basic computer architecture concepts (e.g. speedup, parallelism, pipelining).

    Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application

    New challenges in Astronomy and Astrophysics (AA) are driving the need for a large number of exceptionally computationally intensive simulations. "Exascale" (and beyond) computational facilities are mandatory to address the size of theoretical problems and data coming from the new generation of observational facilities in AA. Currently, the High Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the primary challenge to the achievement of the "Exascale" is power consumption. The goal of this work is to give some insights about the performance and energy footprint of contemporary architectures for a real astrophysical application in an HPC context. We use a state-of-the-art N-body application that we re-engineered and optimized to fully exploit the heterogeneous underlying hardware. We quantitatively evaluate the impact of computation on energy consumption when running on four different platforms. Two of them represent current HPC systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster based on ARM-MPSoC, and one is a "prototype towards Exascale" equipped with ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the different devices: the high-end GPUs excel in terms of time-to-solution, while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience reveals that considering FPGAs for computationally intensive applications seems very promising, as their performance is improving to meet the requirements of scientific applications. This work can be a reference for the development of future platforms for astrophysics applications where computationally intensive calculations are required. Comment: 15 pages, 4 figures, 3 tables; Preprint (V2) submitted to MDPI (Special Issue: Energy-Efficient Computing on Parallel Architectures)

    Evaluating SoC power efficiency through N-body application

    Currently, the High Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the main obstacle to achieving "exascale" performance is power consumption. The usage of "unconventional" low-cost computing systems is therefore of great interest for several scientific communities looking for a better trade-off between performance and power consumption. In this technical report, we make a performance assessment of a commodity low-power System on Chip (SoC) using a direct N-body application for astrophysics. We also describe the methodology we have employed to measure the power drawn by the application while running. We find that SoC technology could represent a valid alternative to traditional technology for HPC, offering a good trade-off between time-to-solution and energy-to-solution. This work arises in the framework of the ExaNeSt and EuroExa projects, which investigate the design of a SoC-based, low-power HPC architecture with a dedicated interconnect scalable to millions of compute units.
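As a rough illustration of the "energy-to-solution" metric mentioned above: energy is the time integral of power, so a series of power readings taken during a run can be integrated numerically. The sketch below uses the trapezoidal rule. It is not the measurement methodology of the report, which the abstract does not detail; the function name `energy_to_solution` and the assumption that timestamped power samples are available are both hypothetical.

```python
def energy_to_solution(timestamps_s, power_w):
    """Integrate sampled power (watts) over time (seconds); returns joules.

    Uses the trapezoidal rule. The sampling source is an assumption: any
    power meter that yields (time, power) pairs would fit.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    if len(timestamps_s) < 2:
        raise ValueError("at least two samples are needed")
    energy_j = 0.0
    for k in range(1, len(timestamps_s)):
        dt = timestamps_s[k] - timestamps_s[k - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples
        energy_j += 0.5 * (power_w[k] + power_w[k - 1]) * dt
    return energy_j
```

Time-to-solution is then simply the last timestamp minus the first, so both metrics can be derived from the same sample series.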

    Panchromatic spectral energy distributions of simulated galaxies: results at redshift z = 0

    We present predictions of spectral energy distributions (SEDs), from the UV to the FIR, of simulated galaxies at z = 0. These were obtained by post-processing the results of an N-body+hydro simulation of a cosmological box of side 25 Mpc, which uses the Multi-Phase Particle Integrator (MUPPI) for star formation and stellar feedback, with the grasil-3d radiative transfer code that includes reprocessing of UV light by dust. Physical properties of our sample of ~500 galaxies resemble observed ones, though with some tension at small and large stellar masses. Comparing predicted SEDs of simulated galaxies with different samples of local galaxies, we find that these resemble observed ones, when normalized at 3.6 μm. A comparison with the Herschel Reference Survey shows that the average SEDs of galaxies, divided in bins of star formation rate (SFR), are reproduced in shape and absolute normalization to within a factor of ~2, while average SEDs of galaxies divided in bins of stellar mass show tensions that are an effect of the difference of simulated and observed galaxies in the stellar mass-SFR plane. We use our sample to investigate the correlation of IR luminosity in Spitzer and Herschel bands with several galaxy properties. SFR is the quantity that best correlates with IR light up to 160 μm, while at longer wavelengths better correlations are found with molecular mass and, at 500 μm, with dust mass. However, using the position of the FIR peak as a proxy for cold dust temperature, we assess that heating of cold dust is mostly determined by SFR, with stellar mass giving only a minor contribution. We finally show how our sample of simulated galaxies can be used as a guide to understand the physical properties and selection biases of observed samples.

    Response of human HT-29 colorectal tumor cells to extended exposure to bromodeoxyuridine

    Effects of the extended exposure of a human colorectal tumor-cell line (HT-29) to bromodeoxyuridine (BrdUrd) were studied in anticipation of the clinical use of that agent to treat colorectal cancer, particularly as a regionally delivered radiosensitizer. We found that 72-h exposure to a concentration of BrdUrd that is estimated to be locally maintained in the liver (100 μM) was significantly cytotoxic with a 3-log reduction in survival. As measured by GC/MS-SIM method, incorporation of BrdUrd into DNA followed an unexpected time course in that continuous exposure to 10 μM BrdUrd resulted in maximal incorporation at 3 days, after which the extent of incorporated analog fell significantly (despite daily changes of the medium). This finding was apparently due to a greater rate of loss of BrdUrd from the medium at later time points. Flow cytometric analysis using an anti-BrdUrd antibody (IU-4) revealed that antibody binding also peaked and fell off with time. However, at exposure times of >24 h, the timing and extent of this decline were significantly different than had been indicated by the GC/MS method. These results indicate that the quantitative relationship between antibody staining and BrdUrd incorporation changes as drug-exposure time increases and that quantitative studies of anti-BrdUrd antibody binding must be interpreted with caution, especially when extended drug-treatment protocols have been used. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/46921/1/280_2004_Article_BF00694337.pd

    EuroEXA - D2.6: Final ported application software

    This document describes the software of the EuroEXA applications as ported to the single CRDB testbed, and it discusses the experience gained from the porting and optimization activities, which should be actively taken into account in future redesign and optimization. This document accompanies the ported application software, found in the EuroEXA private repository (https://github.com/euroexa). In particular, this document describes the status of the software for each of the EuroEXA applications, sketches the redesign and optimization strategy for each application, and discusses the issues and difficulties faced during the porting activities and the related lessons learned. A few preliminary evaluation results are presented; the full evaluation will be discussed in deliverable 2.8.